Results 1 - 5 of 5
1.
IEEE Trans Pattern Anal Mach Intell; 45(2): 2660-2666, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35412977

ABSTRACT

Pose transfer of human videos aims to generate a high-fidelity video of a target person imitating actions of a source person. A few studies have made great progress either through image translation with deep latent features or neural rendering with explicit 3D features. However, both of them rely on large amounts of training data to generate realistic results, and the performance degrades on more accessible Internet videos due to insufficient training frames. In this paper, we demonstrate that the dynamic details can be preserved even when trained from short monocular videos. Overall, we propose a neural video rendering framework coupled with an image-translation-based dynamic details generation network (D2G-Net), which fully utilizes both the stability of explicit 3D features and the capacity of learning components. To be specific, a novel hybrid texture representation is presented to encode both the static and pose-varying appearance characteristics, which is then mapped to the image space and rendered as a detail-rich frame in the neural rendering stage. Through extensive comparisons, we demonstrate that our neural human video renderer is capable of achieving both clearer dynamic details and more robust performance even on accessible short videos with only 2k ∼ 4k frames, as illustrated in Fig. 1.
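
As a rough illustration of the hybrid texture idea described in this abstract, the Python/PyTorch sketch below combines a learnable static feature texture with a pose-conditioned dynamic residual and samples the result into image space. It is not the authors' D2G-Net; all module names, layer sizes, and the pose-vector dimension are hypothetical.

    # Minimal sketch (not the authors' D2G-Net): a hybrid texture combining a
    # learnable static texture with a pose-conditioned dynamic residual, sampled
    # into image space via UV coordinates. Names and sizes are illustrative only.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HybridTexture(nn.Module):
        def __init__(self, channels=16, resolution=256, pose_dim=72):
            super().__init__()
            # Static appearance: a learnable feature texture shared across frames.
            self.static_tex = nn.Parameter(torch.randn(1, channels, resolution, resolution) * 0.01)
            # Pose-varying appearance: a small generator producing a dynamic residual texture.
            self.dynamic_gen = nn.Sequential(
                nn.Linear(pose_dim, 256), nn.ReLU(),
                nn.Linear(256, channels * 32 * 32),
            )
            self.channels = channels

        def forward(self, pose, uv):
            # pose: (B, pose_dim) body-pose vector; uv: (B, H, W, 2) texture coords in [-1, 1].
            b = pose.shape[0]
            dyn = self.dynamic_gen(pose).view(b, self.channels, 32, 32)
            dyn = F.interpolate(dyn, size=self.static_tex.shape[-2:], mode='bilinear', align_corners=False)
            tex = self.static_tex.expand(b, -1, -1, -1) + dyn   # hybrid texture
            # Map texture features into image space; a neural renderer would then decode
            # these sampled features into a detail-rich RGB frame.
            return F.grid_sample(tex, uv, align_corners=False)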

2.
IEEE Trans Image Process; 29: 214-224, 2020.
Article in English | MEDLINE | ID: mdl-31331884

ABSTRACT

Compositing is one of the most important editing operations for images and videos. The process of improving the realism of composite results is often called harmonization. Previous approaches for harmonization mainly focus on images. In this paper, we take one step further to attack the problem of video harmonization. Specifically, we train a convolutional neural network in an adversarial way, exploiting a pixel-wise disharmony discriminator to achieve more realistic harmonized results and introducing a temporal loss to increase temporal consistency between consecutive harmonized frames. Thanks to the pixel-wise disharmony discriminator, we are also able to relieve the need for input foreground masks. Since existing video datasets that provide ground-truth foreground masks and optical flow are not sufficiently large, we propose a simple yet efficient method to build a synthetic dataset supporting supervised training of the proposed adversarial network. The experiments show that training on our synthetic dataset generalizes well to the real-world composite dataset. In addition, our method successfully incorporates temporal consistency during training and achieves more harmonious visual results than previous methods.
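
The sketch below illustrates, in simplified form, the two training signals named in the abstract: a temporal loss between consecutive harmonized frames aligned by optical flow, and a pixel-wise adversarial term driven by a per-pixel disharmony map. It is an assumption-laden illustration, not the paper's network or loss weights; warp_with_flow and the loss weighting are invented for the example.

    # Minimal sketch, not the paper's implementation: temporal consistency via
    # flow warping plus a pixel-wise (per-pixel) adversarial term.
    import torch
    import torch.nn.functional as F

    def warp_with_flow(frame, flow):
        # frame: (B, C, H, W); flow: (B, 2, H, W), channel 0 = x, channel 1 = y displacement.
        b, _, h, w = frame.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
        grid = torch.stack((xs, ys), dim=0).float().to(frame.device)   # (2, H, W)
        coords = grid.unsqueeze(0) + flow                              # displaced pixel coords
        coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
        coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
        grid_norm = torch.stack((coords_x, coords_y), dim=-1)          # (B, H, W, 2)
        return F.grid_sample(frame, grid_norm, align_corners=True)

    def harmonization_losses(pred_t, pred_prev, flow, disharmony_map):
        # Temporal loss: consecutive harmonized frames should agree after flow warping.
        temporal = F.l1_loss(pred_t, warp_with_flow(pred_prev, flow))
        # Adversarial term: a pixel-wise discriminator outputs a disharmony probability
        # per pixel; the generator is pushed to make every pixel look harmonious (label 0).
        adversarial = F.binary_cross_entropy(disharmony_map, torch.zeros_like(disharmony_map))
        return temporal + 0.1 * adversarial   # the weight is an arbitrary illustrative choice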

3.
IEEE Trans Image Process; 25(3): 1152-62, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26731765

ABSTRACT

In this paper, we present a novel algorithm to simultaneously accomplish color quantization and dithering of images. This is achieved by minimizing a perception-based cost function, which considers pixel-wise differences between filtered versions of the quantized image and the input image. We use edge-aware filters in defining the cost function to avoid mixing colors on the opposite sides of an edge. The importance of each pixel is weighted according to its saliency. To rapidly minimize the cost function, we use a modified multi-scale iterative conditional mode (ICM) algorithm, which updates one pixel at a time while keeping all other pixels unchanged. As ICM is a local method, careful initialization is required to prevent termination at a local minimum far from the global one. To address this problem, we initialize ICM with a palette generated by a modified median-cut method. Compared with previous approaches, our method produces high-quality results with fewer visual artifacts while requiring significantly less computational effort.
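
To make the ICM update concrete, the sketch below cycles through pixels and assigns each the palette entry that minimizes a filtered difference to the input while all other pixels stay fixed. It deliberately simplifies the cost described above: a plain Gaussian filter stands in for the edge-aware filter, saliency weights are uniform, initialization is nearest-color rather than modified median-cut, and the cost is evaluated globally, so it is far slower than the multi-scale scheme in the paper.

    # Illustrative ICM-style quantization/dithering loop, not the authors' algorithm.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def icm_quantize(image, palette, iters=3, sigma=1.0):
        # image: (H, W, 3) float in [0, 1]; palette: (K, 3) float in [0, 1].
        h, w, _ = image.shape
        # Initialize each pixel with its nearest palette color (a median-cut palette
        # would be used in practice to avoid poor local minima).
        dists = ((image[:, :, None, :] - palette[None, None]) ** 2).sum(-1)
        labels = dists.argmin(-1)
        for _ in range(iters):
            for y in range(h):
                for x in range(w):
                    best, best_cost = labels[y, x], np.inf
                    for k in range(len(palette)):
                        labels[y, x] = k
                        recon = palette[labels]
                        # Perception-inspired cost: difference of low-pass filtered images
                        # (evaluated over the whole image here purely for clarity).
                        diff = gaussian_filter(recon - image, sigma=(sigma, sigma, 0))
                        cost = (diff ** 2).sum()
                        if cost < best_cost:
                            best, best_cost = k, cost
                    labels[y, x] = best
        return palette[labels]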

4.
IEEE Trans Vis Comput Graph; 22(8): 1945-58, 2016 Aug.
Article in English | MEDLINE | ID: mdl-26394429

ABSTRACT

Previous works on image completion typically aim to produce visually plausible results rather than factually correct ones. In this paper, we propose an approach to faithfully complete the missing regions of an image. We assume that the input image is taken at a well-known landmark, so similar images taken at the same location can be easily found on the Internet. We first download thousands of images from the Internet using a text label provided by the user. Next, we apply two-step filtering to reduce them to a small set of candidate images for use as source images for completion. For each candidate image, a co-matching algorithm is used to find correspondences of both points and lines between the candidate image and the input image. These are used to find an optimal warp relating the two images. A completion result is obtained by blending the warped candidate image into the missing region of the input image. The completion results are ranked according to a combination score, which considers both warping and blending energy, and the highest-ranked ones are shown to the user. Experiments and results demonstrate that our method can faithfully complete images.
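
The sketch below traces the per-candidate pipeline in simplified form: match features between the candidate and the input, estimate a warp, and blend the warped candidate into the hole. SIFT point matching, a single RANSAC homography, and hard masking stand in for the point-and-line co-matching, optimized warp, and blending energy described above; the function is only an illustration of the abstract, not the paper's method.

    # Simplified per-candidate completion step using OpenCV; all choices here are
    # stand-ins for the paper's co-matching, warping, and blending components.
    import cv2
    import numpy as np

    def complete_with_candidate(input_img, candidate_img, hole_mask):
        # Match keypoints between candidate and input.
        sift = cv2.SIFT_create()
        k1, d1 = sift.detectAndCompute(cv2.cvtColor(input_img, cv2.COLOR_BGR2GRAY), None)
        k2, d2 = sift.detectAndCompute(cv2.cvtColor(candidate_img, cv2.COLOR_BGR2GRAY), None)
        matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d2, d1)
        src = np.float32([k2[m.queryIdx].pt for m in matches])
        dst = np.float32([k1[m.trainIdx].pt for m in matches])
        # Estimate a warp relating the two images (the paper optimizes a warp from both
        # point and line correspondences; a RANSAC homography is used here as a stand-in).
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        warped = cv2.warpPerspective(candidate_img, H, (input_img.shape[1], input_img.shape[0]))
        # Paste the warped candidate into the missing region (no blending energy here).
        mask3 = (hole_mask[..., None] > 0)
        return np.where(mask3, warped, input_img)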

5.
IEEE Trans Vis Comput Graph; 21(12): 1390-402, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26529460

ABSTRACT

With broader availability of large-scale 3D model repositories, the need for efficient and effective exploration becomes more and more urgent. Existing model retrieval techniques do not scale well with the size of the database since often a large number of very similar objects are returned for a query, and the possibilities to refine the search are quite limited. We propose an interactive approach where the user feeds an active learning procedure by labeling either entire models or parts of them as "like" or "dislike" such that the system can automatically update an active set of recommended models. To provide an intuitive user interface, candidate models are presented based on their estimated relevance for the current query. From the methodological point of view, our main contribution is to exploit not only the similarity between a query and the database models but also the similarities among the database models themselves. We achieve this by an offline pre-processing stage, where global and local shape descriptors are computed for each model and a sparse distance metric is derived that can be evaluated efficiently even for very large databases. We demonstrate the effectiveness of our method by interactively exploring a repository containing over 100K models.
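
One way to picture how similarities among the database models themselves can enter the relevance estimate is to propagate the user's like/dislike labels over a precomputed sparse k-nearest-neighbor similarity graph, as in the sketch below. This is an illustrative stand-in, not the paper's active-learning procedure; the graph construction, damping factor, and scoring are assumptions.

    # Illustrative label propagation over a sparse similarity graph of shape descriptors.
    import numpy as np
    import scipy.sparse as sp

    def propagate_relevance(knn_graph, liked, disliked, alpha=0.85, iters=50):
        # knn_graph: (N, N) sparse, symmetric similarity matrix; liked/disliked: index lists.
        n = knn_graph.shape[0]
        # Row-normalize so propagation behaves like a random-walk diffusion.
        deg = np.asarray(knn_graph.sum(axis=1)).ravel()
        deg[deg == 0] = 1.0
        W = sp.diags(1.0 / deg) @ knn_graph
        seed = np.zeros(n)
        seed[list(liked)] = 1.0
        seed[list(disliked)] = -1.0
        scores = seed.copy()
        for _ in range(iters):
            # Mix neighborhood evidence with the user's labels at every step.
            scores = alpha * (W @ scores) + (1 - alpha) * seed
        return scores   # higher score: recommend; lower score: suppress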
